Linear Coherent Bi-cluster Discovery via Line Detection and Sample Majority Voting
Authors
Abstract
Discovering groups of genes that share common expression profiles is an important problem in DNA microarray analysis. Unfortunately, standard bi-clustering algorithms often fail to retrieve common expression groups because (1) genes only exhibit similar behaviors over a subset of conditions, and (2) genes may participate in more than one functional process and therefore belong to multiple groups. Many algorithms have been proposed to address these problems in the past decade; however, beyond the above challenges, most such algorithms are unable to discover linear coherent bi-clusters, a strict generalization of the additive and multiplicative bi-clustering models. In this paper, we propose a novel bi-clustering algorithm that discovers linear coherent bi-clusters by first detecting linear correlations between pairs of gene expression profiles, then identifying groups by sample majority voting. Our experimental results on synthetic data and two real datasets, Saccharomyces cerevisiae and Arabidopsis thaliana, show significant performance improvements over previous methods. One intriguing aspect of our approach is that it can easily be extended to identify bi-clusters of more complex gene-gene correlations.
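The two steps named in the abstract, pairwise line detection followed by sample majority voting, can be illustrated with a toy sketch. The abstract gives no algorithmic details, so everything below is an assumption made for illustration and not the authors' method: the function names `line_support` and `majority_vote`, the exhaustive search over sample pairs, the tolerance `tol`, and the tiny planted dataset are all hypothetical. The sketch relies on the standard reading of a linear coherent relation between two genes, y = k*x + b, where k = 1 gives the additive model and b = 0 the multiplicative one.

```python
import itertools
import numpy as np

def line_support(x, y, tol=1e-6):
    """Return (k, b, inliers) for the line y = k*x + b through two samples
    that is matched by the largest number of samples (within tol)."""
    best = (0.0, 0.0, np.zeros(len(x), dtype=bool))
    for s, t in itertools.combinations(range(len(x)), 2):
        if x[s] == x[t]:
            continue  # vertical line: slope undefined, skip this pair
        k = (y[t] - y[s]) / (x[t] - x[s])
        b = y[s] - k * x[s]
        inliers = np.abs(y - (k * x + b)) <= tol
        if inliers.sum() > best[2].sum():
            best = (k, b, inliers)
    return best

def majority_vote(data, seed, genes, tol=1e-6):
    """Boolean mask of samples supported by more than half of the
    (seed, gene) lines; each gene pair casts one vote per sample."""
    votes = sum(line_support(data[seed], data[g], tol)[2].astype(int)
                for g in genes)
    return votes > len(genes) / 2

# Hypothetical data: rows are genes, columns are samples.
# Samples 0-4 carry a planted linear relation to gene 0; samples 5-7 do not.
data = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0, 9.0, 7.0, 8.0],     # seed gene x
    [3.0, 5.0, 7.0, 9.0, 11.0, 0.0, 4.0, -2.0],   # 2*x + 1 on samples 0-4
    [2.0, 1.0, 0.0, -1.0, -2.0, 5.0, 5.0, 1.0],   # -x + 3 on samples 0-4
    [0.5, 1.0, 1.5, 2.0, 2.5, 0.0, 9.0, -3.0],    # 0.5*x on samples 0-4
])
members = majority_vote(data, seed=0, genes=[1, 2, 3])
```

On this planted example, `members` selects samples 0 through 4 and rejects the remaining three. A practical implementation would need a noise-tolerant line detector and a cheaper search than the quadratic scan over sample pairs used here.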
Similar resources
Linear Coherent Bi-cluster Discovery via Beam Detection and Sample Set Clustering
We propose a new bi-clustering algorithm, LinCoh, for finding linear coherent bi-clusters in gene expression microarray data. Our method exploits a robust technique for identifying conditionally correlated genes, combined with an efficient density based search for clustering sample sets. Experimental results on both synthetic and real datasets demonstrated that LinCoh consistently finds more ac...
Linear Coherent Bi-Clustering via Beam Searching and Sample Set Clustering
We propose a new bi-clustering algorithm, LinCoh, for finding linear coherent bi-clusters in gene expression microarray data. Our method exploits a robust technique for identifying conditionally correlated genes, combined with an efficient density based search for clustering sample sets. Experimental results on both synthetic and real datasets demonstrated that LinCoh consistently finds more ac...
Sparse Learning Based Linear Coherent Bi-clustering
Clustering algorithms are often limited by an assumption that each data point belongs to a single class, and furthermore that all features of a data point are relevant to class determination. Such assumptions are inappropriate in applications such as gene clustering, where, given expression profile data, genes may exhibit similar behaviors only under some, but not all conditions, and genes may ...
Finding Consistent Clusters in Data Partitions
Given an arbitrary data set, to which no particular parametrical, statistical or geometrical structure can be assumed, different clustering algorithms will in general produce different data partitions. In fact, several partitions can also be obtained by using a single clustering algorithm due to dependencies on initialization or the selection of the value of some design parameter. This paper ad...
The Role of Virtual News Networks on Voting Behavior (Case study: Political Science Students Islamic Azad University South Tehran Branch in 26 February 2016 Election)
The present study aims to investigate the impact of virtual news networks on the political participation of Iranians in the parliamentary election on 26 February 2016. The method of...